Abstract

In this paper, Smale's $\alpha$-theory is generalized to the context of intrinsic Newton iteration on geodesically complete analytic Riemannian and Hermitian manifolds. Results are valid for analytic mappings from a manifold to a linear space of the same dimension, or for analytic vector fields on the manifold. The invariant $\gamma$ is defined by means of high order covariant derivatives. Bounds on the size of the basin of quadratic convergence are given. If the ambient manifold has negative sectional curvature, those bounds depend on the curvature. A criterion of quadratic convergence for Newton iteration from the information available at a point is also given.

1 Introduction and main results. 

Numerical problems posed on manifolds arise in many natural contexts. Classical examples are the eigenvalue problem, the symmetric eigenvalue problem, invariant subspace computations, minimization problems with orthogonality constraints, and optimization problems with equality constraints. In the first example, $Ax = \lambda x$, the unknowns are the eigenvalue $\lambda \in \mathbb{C}$ and the eigenvector $x \in \mathbb{P}_{n-1}(\mathbb{C})$, the complex projective space consisting of complex vector lines through the origin in $\mathbb{C}^n$. In the second example, $Ax = \lambda x$ with $A$ real and symmetric, the unknowns are $\lambda \in \mathbb{R}$ and $x \in \mathbb{S}^{n-1}$, the unit sphere in $\mathbb{R}^n$. In the third example the unknown is a $k$-dimensional subspace of $\mathbb{C}^n$, that is, an element of the Grassmann manifold $\mathbb{G}_{n,k}(\mathbb{C})$. The fourth example involves the orthogonal group, the special orthogonal group or the Stiefel manifold ($n \times k$ matrices with orthonormal columns). The last example leads to problems posed on submanifolds of $\mathbb{R}^n$.

For such or similar problems our objective is to design algorithms which respect their geometrical structure. We follow here the lines of the Geometric Integration Interest Group (http://www.focm.net/gi/), who showed the interest of such an approach.

The first author's original motivation came from homogeneous and multihomogeneous polynomial systems (Dedieu-Shub) and also from a model for the human spine (Adler-Dedieu-Margulies-Martens-Shub) with configuration space $SO(3)^{18}$. A second motivation, for the second author, came from sparse polynomial systems of equations where the solutions belong to a certain toric variety (Malajovich-Rojas).

For such problems one often has to compute the solutions of a system of equations or to find the zeros of a vector field. For this reason we investigate here one of the most famous methods for approximately solving these problems: Newton's method.

In this paper, we investigate the local behavior of Newton's iteration close to a solution. While a lot is known about Newton's iteration in linear spaces [2], little is known about intrinsic Newton's iteration in more general manifolds. Our main results here (Theorems 1.3 to 1.6 below) extend Smale's $\alpha$-theory to analytic Riemannian manifolds. The $\alpha$-theory provides a criterion for the quadratic convergence of Newton's iteration in a neighborhood of a solution. This criterion depends only on data available at the approximate solution. One important application (outside the scope of this paper) is the construction of rigorous homotopy algorithms for the solution of non-linear equations.

More precisely, we will study quantitative aspects of Newton's method for finding zeros of mappings $f : \mathbb{M}_n \to \mathbb{R}^n$ and vector fields $X : \mathbb{M}_n \to T\mathbb{M}_n$. Here $\mathbb{M}_n$ denotes a real complete analytic Riemannian manifold, $T\mathbb{M}_n$ its tangent bundle, and $f$ and $X$ are analytic. We denote by $T_z\mathbb{M}_n$ the tangent space at $z$ to $\mathbb{M}_n$, by $\langle \cdot, \cdot \rangle_z$ the scalar product on $T_z\mathbb{M}_n$ with associated norm $\|\cdot\|_z$, by $d$ the Riemannian distance on $\mathbb{M}_n$ and by $\exp_z : T_z\mathbb{M}_n \to \mathbb{M}_n$ the exponential map. This map is defined on the whole tangent bundle $T\mathbb{M}_n$ because $\mathbb{M}_n$ is assumed to be complete. We denote by $r_z > 0$ the radius of injectivity of the exponential map at $z$. Thus, $\exp_z : B_{T_z}(0, r_z) \to B_{\mathbb{M}_n}(z, r_z)$ is one to one ($B(u,r)$ is the open ball about $u$ with radius $r$, $\bar{B}(u,r)$ is the closed ball).

When $\mathbb{M}_n = \mathbb{R}^n$ the Newton operator associated with $f$ is defined by

\[ N_f(z) = z - Df(z)^{-1} f(z). \]

In this context $T_z\mathbb{R}^n$ may be identified with $\mathbb{R}^n$ and $\exp_z(u) = z + u$, so that

\[ N_f(z) = \exp_z\left(-Df(z)^{-1} f(z)\right). \]

This formula makes sense in the context of Riemannian manifolds and we define the Newton operator $N_f : \mathbb{M}_n \to \mathbb{M}_n$ in this way.

When, instead of a mapping $\mathbb{M}_n \to \mathbb{R}^n$, we consider a vector field $X : \mathbb{M}_n \to T\mathbb{M}_n$, we resort, in order to define Newton's method, to an object studied in differential geometry, namely the covariant derivative of vector fields. Let $\nabla$ denote the Levi-Civita connection on $\mathbb{M}_n$. For any vector fields $X$ and $Y$ on $\mathbb{M}_n$, $\nabla_X(Y)$ is called the covariant derivative of $Y$ with respect to $X$. Since $\nabla$ is tensorial in $X$, the value of $\nabla_X(Y)$ at $z \in \mathbb{M}_n$ depends only on the tangent vector $u = X(z) \in T_z\mathbb{M}_n$. For this reason we denote it

\[ (\nabla_X(Y))(z) = DY(z)(u). \]

It is a linear map

\[ DY(z) : T_z\mathbb{M}_n \to T_z\mathbb{M}_n. \]

The Newton operator for the vector field $X$ is defined by

\[ N_X(z) = \exp_z\left(-DX(z)^{-1} X(z)\right). \]

Notice that this definition coincides with the usual one when $X$ is a vector field in $\mathbb{R}^n$, because the covariant derivative is then just the usual derivative.

In a vector space framework, Newton's method makes zeros of $f$ with nonsingular derivative correspond to fixed points of $N_f$, and Newton sequences $x_{k+1} = N_f(x_k)$, for an initial point $x_0$ taken close to such a fixed point $\zeta$, converge quadratically to $\zeta$. In this paper, our aim is to make these statements precise in our new geometric framework and to investigate quantitative aspects. We have in mind the following two theorems, which are valid when $\mathbb{M}_n$ is equal to $\mathbb{R}^n$ or in the more general context of an analytic mapping $f : E \to F$ between two real or complex Banach spaces:

Theorem 1.1 ($\gamma$-Theorem, Smale, 1986) Suppose that $f(\zeta) = 0$ and $Df(\zeta)$ is an isomorphism. Let

\[ \gamma(f, z) = \sup_{k \ge 2} \left\| Df(z)^{-1} \frac{D^k f(z)}{k!} \right\|^{1/(k-1)}. \]

If

\[ \|z - \zeta\| \le \frac{3 - \sqrt{7}}{2\gamma(f, \zeta)}, \]

then the Newton sequence $z_k = N_f^{(k)}(z)$ is defined for all $k \ge 0$ and

\[ \|z_k - \zeta\| \le \left(\frac{1}{2}\right)^{2^k - 1} \|z - \zeta\|. \]

For a proof see Blum-Cucker-Shub-Smale [2], Chap. 8, Theorem 1. The second theorem we want to extend to the context of Riemannian manifolds is the following:

Theorem 1.2 ($\alpha$-Theorem, Smale, 1986) Let

\[ \beta(f, z) = \|Df(z)^{-1} f(z)\| \]

and

\[ \alpha(f, z) = \beta(f, z)\,\gamma(f, z). \]

We also let $\alpha(f, z) = \infty$ when $Df(z)$ is not invertible. There is a universal constant $\alpha_0 > 0$ with the following property: if $\alpha(f, z) < \alpha_0$ then there is a zero $\zeta$ of $f$ such that $Df(\zeta)$ is an isomorphism and such that the Newton sequence $z_k = N_f^{(k)}(z)$ is defined for all $k \ge 0$ and satisfies

\[ \|z_k - \zeta\| \le \left(\frac{1}{2}\right)^{2^k - 1} \|z - \zeta\|. \]

Moreover, the distance from $z$ to the zero $\zeta$ is at most $2\beta(f, z)$.

This second theorem is proved in Smale with the constant $\alpha_0 = 0.13071\ldots$; see also Kim for a one-dimensional version.
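As an illustration of Theorems 1.1 and 1.2 in the simplest setting, the following sketch computes $\gamma$, $\beta$ and $\alpha$ for a real polynomial in one variable and checks the error bound of Theorem 1.1 along a Newton sequence. The function names, the test polynomial $f(x) = x^2 - 2$ and the starting point are assumptions chosen for this example; they do not come from the paper.

```python
import math

def taylor_coeffs(coeffs, z):
    """Return [D^k f(z) / k!] for k = 0..deg, where f(x) = sum(coeffs[j] * x**j)."""
    deg = len(coeffs) - 1
    return [
        sum(coeffs[j] * math.comb(j, k) * z ** (j - k) for j in range(k, deg + 1))
        for k in range(deg + 1)
    ]

def gamma(coeffs, z):
    """gamma(f, z) = sup_{k>=2} |f'(z)^{-1} f^(k)(z) / k!|^(1/(k-1))."""
    t = taylor_coeffs(coeffs, z)
    if t[1] == 0:
        return math.inf
    return max((abs(t[k] / t[1]) ** (1.0 / (k - 1)) for k in range(2, len(t))),
               default=0.0)

def beta(coeffs, z):
    """beta(f, z) = |f'(z)^{-1} f(z)|, the length of the Newton step."""
    t = taylor_coeffs(coeffs, z)
    return math.inf if t[1] == 0 else abs(t[0] / t[1])

def alpha(coeffs, z):
    return beta(coeffs, z) * gamma(coeffs, z)

def newton(coeffs, z):
    t = taylor_coeffs(coeffs, z)
    return z - t[0] / t[1]

# Hypothetical test case: f(x) = x^2 - 2 with zero zeta = sqrt(2).
f = [-2.0, 0.0, 1.0]
zeta = math.sqrt(2.0)
radius = (3 - math.sqrt(7)) / (2 * gamma(f, zeta))  # ball of Theorem 1.1

z0 = zeta + 0.5  # chosen inside that ball
errors = []
z = z0
for k in range(6):  # stop before rounding errors dominate the bound
    errors.append(abs(z - zeta))
    z = newton(f, z)
```

Under these assumptions each `errors[k]` can be compared with `0.5 ** (2**k - 1) * errors[0]`, and `alpha(f, z0)` with the constant $\alpha_0$ of Theorem 1.2.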

1.1 Definitions and notations. 

In order to generalize these two results we have to define the corresponding invariants in the context of Riemannian manifolds. The material contained in this section is classical in Riemannian geometry. The reader is referred to a textbook on this subject, for example: Dieudonné [7], Do Carmo [8], Gallot-Hulin-Lafontaine [11], Helgason [12], O'Neill [21].

Definition 1.1 (Tensors.) The space of $p$-contravariant and $q$-covariant analytic tensor fields

\[ T : T(\mathbb{M}_n)^p \times T^*(\mathbb{M}_n)^q \to \mathcal{F}(\mathbb{M}_n) \]

is denoted by $\mathcal{T}^p_q(\mathbb{M}_n)$. An $m$-tuple of such tensor fields is called a vectorial tensor field and the space of vectorial tensor fields is denoted by $\mathcal{T}^p_q(\mathbb{M}_n, \mathbb{R}^m)$.

Here $T^*(\mathbb{M}_n)$ is the cotangent bundle on $\mathbb{M}_n$ (the space of 1-forms) and $\mathcal{F}(\mathbb{M}_n)$ the space of scalar analytic functions defined on $\mathbb{M}_n$. We let $\mathcal{F}(\mathbb{M}_n) = \mathcal{T}^0_0(\mathbb{M}_n)$. Let $\nabla$ denote the Levi-Civita connection on $\mathbb{M}_n$. For any vector fields $X$ and $Y$ on $\mathbb{M}_n$, $\nabla_X(Y)$ is called the covariant derivative of $Y$ with respect to $X$.
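Definition 1.2 (Covariant derivative for tensor fields.) Let $X$ be a vector field on $\mathbb{M}_n$. For any integers $p, q \ge 0$ and any tensor field $T \in \mathcal{T}^p_q(\mathbb{M}_n)$ the covariant derivative is defined by:

• $\nabla_X(g) = X(g) = Dg(X)$, the derivative of $g$ along the vector field $X$, when $g$ is a function: $g \in \mathcal{T}^0_0(\mathbb{M}_n)$.

• $\nabla_X(Y)$ is given by the connection when $Y$ is a vector field, i.e. $Y \in \mathcal{T}^1_0(\mathbb{M}_n)$.

• For a 1-form $\omega \in \mathcal{T}^0_1(\mathbb{M}_n)$ its covariant derivative is the 1-form defined by

\[ \nabla_X(\omega)(Y) = X(\omega(Y)) - \omega(\nabla_X(Y)) \]

for any vector field $Y$.

• For a tensor field $T \in \mathcal{T}^p_q(\mathbb{M}_n)$ the covariant derivative is the tensor field $\nabla_X T \in \mathcal{T}^p_q(\mathbb{M}_n)$ defined by

\[ \nabla_X T(\omega^1, \ldots, \omega^p, Y_1, \ldots, Y_q) = X(T(\omega^1, \ldots, \omega^p, Y_1, \ldots, Y_q)) - T(\nabla_X(\omega^1), \ldots, \omega^p, Y_1, \ldots, Y_q) - \cdots - T(\omega^1, \ldots, \omega^p, Y_1, \ldots, \nabla_X(Y_q)) \]

for any 1-forms $\omega^i$ and vector fields $Y_j$.

• For a vectorial tensor field $T = (T_1, \ldots, T_m) \in \mathcal{T}^p_q(\mathbb{M}_n, \mathbb{R}^m)$,

\[ \nabla_X T = \begin{pmatrix} \nabla_X T_1 \\ \vdots \\ \nabla_X T_m \end{pmatrix}. \]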

Definition 1.3 (Covariant $k$-th derivative for tensor fields.) Let $X$ be a vector field on $\mathbb{M}_n$. For any integers $p, q \ge 0$ and any tensor field $T \in \mathcal{T}^p_q(\mathbb{M}_n, \mathbb{R}^m)$ the $k$-th covariant derivative is defined inductively by

\[ \nabla^k_X T = \nabla_X\left(\nabla^{k-1}_X T\right). \]

Since the covariant derivative is tensorial in $X$, its value at a given point $z \in \mathbb{M}_n$ depends only on the vector $X(z)$. For this reason, the following definition makes sense:

Definition 1.4 (Covariant $k$-th derivative for tensor fields at a point.) Let a point $z \in \mathbb{M}_n$ and a vector $u \in T_z\mathbb{M}_n$ be given. Let $X$ be a vector field such that $X(z) = u$. For any integers $p, q \ge 0$ and any tensor field $T \in \mathcal{T}^p_q(\mathbb{M}_n, \mathbb{R}^m)$ the value at $z$ of the $k$-th covariant derivative is denoted by:

\[ D^k T(z)(u, \ldots, u) = D^k T(z) u^k = (\nabla^k_X T)(z). \]

It defines a $k$-multilinear map

\[ D^k T(z) : \left( (T_z\mathbb{M}_n)^p \times (T_z^*\mathbb{M}_n)^q \right)^k \to \mathbb{R}^m. \]
Definition 1.5 (Norm of a multilinear map.) Let

\[ M : (T_z\mathbb{M}_n)^k \to \mathbb{R}^m \]

be a $k$-multilinear map. Its norm is defined by

\[ \|M\|_z = \sup \|M(u_1, \ldots, u_k)\|_{\mathbb{R}^m} \]

where the supremum is taken over all vectors $u_j \in T_z\mathbb{M}_n$ such that $\|u_j\|_z = 1$.

The following definition extends the definition of $\gamma(f, z)$ to a Riemannian context.



Definition 1.6 (Gamma.) Let a map $f : \mathbb{M}_n \to \mathbb{R}^n$ and a vector field $X : \mathbb{M}_n \to T\mathbb{M}_n$ be given. For any point $z \in \mathbb{M}_n$ we let

\[ \gamma(f, z) = \sup_{k \ge 2} \left\| Df(z)^{-1} \frac{D^k f(z)}{k!} \right\|_z^{1/(k-1)}, \qquad \gamma(X, z) = \sup_{k \ge 2} \left\| DX(z)^{-1} \frac{D^k X(z)}{k!} \right\|_z^{1/(k-1)}. \]

We also let $\gamma(f, z) = \infty$ when $Df(z)$ is not invertible, and likewise for $\gamma(X, z)$.

This definition is justified by Definitions 1.4 and 1.5. When $Df(z)$ is invertible then, by analyticity, $\gamma(f, z)$ is finite. We also have to consider the following number, related to the sectional curvature at $\zeta$:



Definition 1.7 For any $\zeta \in \mathbb{M}_n$,

\[ K_\zeta = \sup \frac{d(\exp_z(u), \exp_z(v))}{\|u - v\|_z} \]

where the supremum is taken over all $z \in B_{\mathbb{M}_n}(\zeta, r_\zeta)$ and $u, v \in T_z\mathbb{M}_n$ with $\|u\|_z$ and $\|v\|_z < r_\zeta$, with $r_\zeta$ the radius of injectivity at $\zeta$.

Remark 1.1 • $K_\zeta$ measures how fast the geodesics spread apart in $\mathbb{M}_n$. When $u = 0$, or more generally when $u$ and $v$ are on the same line through $0$,

\[ d(\exp_z(u), \exp_z(v)) = \|u - v\|_z. \]

Therefore, we always have

\[ K_\zeta \ge 1. \]

• When $\mathbb{M}_n$ has non-negative sectional curvature, the geodesics spread apart less than the rays (Do Carmo [8], Chap. V-2), so that

\[ d(\exp_z(u), \exp_z(v)) \le \|u - v\|_z \]

and consequently

\[ K_\zeta = 1. \]

• Examples of manifolds with non-negative curvature are given by $\mathbb{R}^n$, $\mathbb{S}^n$ the unit sphere in $\mathbb{R}^{n+1}$, $\mathbb{P}^n(\mathbb{R})$ the real projective space, i.e. the space of real vector lines in $\mathbb{R}^{n+1}$ ([8], Chap. 8, Prop. 4-4), $\mathbb{P}^n(\mathbb{C})$ the complex projective space, i.e. the space of complex vector lines in $\mathbb{C}^{n+1}$ ([8], Chap. 8, Exerc. 11), a Lie group with a bi-invariant metric ([8], Chap. 4, Exerc. 1), and $\mathbb{O}_n$ and $\mathbb{SO}_n$, the orthogonal and special orthogonal groups (Lie groups).



1.2 Main results for mappings. 

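Our first main theorem relates the size of the quadratic attraction basin of a zero $\zeta$ of $f$ to the invariants $\gamma(f, \zeta)$ and $K_\zeta$.

Theorem 1.3 (R-$\gamma$-theorem) Let $f : \mathbb{M}_n \to \mathbb{R}^n$ be analytic. Suppose that $f(\zeta) = 0$ and $Df(\zeta)$ is an isomorphism. Let

\[ R(f, \zeta) = \min\left( r_\zeta,\ \frac{K_\zeta + 2 - \sqrt{K_\zeta^2 + 4K_\zeta + 2}}{2\gamma(f, \zeta)} \right). \]

If $d(z, \zeta) \le R(f, \zeta)$ then the Newton sequence $z_k = N_f^{(k)}(z)$ is defined for all $k \ge 0$, and

\[ d(z_k, \zeta) \le \left(\frac{1}{2}\right)^{2^k - 1} d(z, \zeta). \]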

Remark 1.2 When $\mathbb{M}_n = \mathbb{R}^n$ is equipped with the usual metric structure, the radius of injectivity is $r_\zeta = \infty$ and $K_\zeta = 1$. Thus, $R(f, \zeta) = (3 - \sqrt{7})/(2\gamma(f, \zeta))$ as in Theorem 1.1.
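The radius $R(f, \zeta)$ of Theorem 1.3 is easy to evaluate once $r_\zeta$, $K_\zeta$ and $\gamma(f, \zeta)$ are known. The sketch below, whose function names are assumptions made for this illustration, evaluates it and recovers the Euclidean value of Remark 1.2 for $K_\zeta = 1$ and $r_\zeta = \infty$. It also evaluates $K_\zeta \nu / \psi(\nu)$ with $\psi(u) = 1 - 4u + 2u^2$, the contraction factor that the proof bounds by $1/2$.

```python
import math

def psi(u):
    """psi(u) = 1 - 4u + 2u^2, as used throughout the paper."""
    return 1 - 4 * u + 2 * u * u

def nu_threshold(K):
    """Largest admissible nu = d(z, zeta) * gamma(f, zeta) in Theorem 1.3."""
    return (K + 2 - math.sqrt(K * K + 4 * K + 2)) / 2

def basin_radius(r_zeta, K, gamma):
    """R(f, zeta) = min(r_zeta, nu_threshold(K) / gamma)."""
    return min(r_zeta, nu_threshold(K) / gamma)

def contraction_factor(K, nu):
    """K * nu / psi(nu); it equals 1/2 exactly at nu = nu_threshold(K)."""
    return K * nu / psi(nu)

# Euclidean case of Remark 1.2 with a hypothetical gamma = 1.
euclidean_radius = basin_radius(math.inf, 1.0, 1.0)
```

Larger values of $K_\zeta$ (more negative curvature) shrink the threshold, which is how the curvature enters the bound.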

When $\mathbb{M}_n$ has non-negative sectional curvature, according to Remark 1.1 one has $K_\zeta = 1$ and Theorem 1.3 becomes

Corollary 1.1 When $\mathbb{M}_n$ has non-negative sectional curvature, let $f : \mathbb{M}_n \to \mathbb{R}^n$ be analytic. Suppose that $f(\zeta) = 0$ and $Df(\zeta)$ is an isomorphism. Let

\[ R(f, \zeta) = \min\left( r_\zeta,\ \frac{3 - \sqrt{7}}{2\gamma(f, \zeta)} \right). \]

If $d(z, \zeta) \le R(f, \zeta)$ then the Newton sequence $z_k = N_f^{(k)}(z)$ is defined for all $k \ge 0$, and

\[ d(z_k, \zeta) \le \left(\frac{1}{2}\right)^{2^k - 1} d(z, \zeta). \]

Theorem 1.3 has two interesting and immediate consequences: a lower estimate for the distance from other zeros and a lower estimate for the distance from the singular locus

\[ \Sigma_f = \{ z \in \mathbb{M}_n : \det Df(z) = 0 \}. \]

Corollary 1.2 Suppose that $f(\zeta) = 0$ and $Df(\zeta)$ is an isomorphism. Then, for any other zero $\zeta' \ne \zeta$ one has

\[ d(\zeta', \zeta) > R(f, \zeta). \]

Moreover, for any $z \in \Sigma_f$ the same inequality holds:

\[ d(z, \zeta) > R(f, \zeta). \]

Our second main theorem generalizes Theorem 1.2. We give sufficient conditions for $z \in \mathbb{M}_n$ to be the starting point of a quadratically convergent Newton sequence. These conditions are given in terms of $f$ at $z$, not in terms of the behaviour of $f$ in a neighborhood of $z$ as in Kantorovich theory. We first need three definitions.

Definition 1.8 The function $\psi(u) = 1 - 4u + 2u^2$ is decreasing from 1 to 0 when $0 \le u \le 1 - \sqrt{2}/2$. We denote by $\alpha_0 = 0.130716944\ldots$ the unique root of the equation $2u = \psi(u)^2$ in this interval.

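Definition 1.9 $\sigma$ is the sum of the following series:

\[ \sigma = \sum_{k \ge 0} \left(\frac{1}{2}\right)^{2^k - 1} = 1.632843018\ldots \]

Definition 1.10

\[ s_0 = \frac{1}{\sigma + \dfrac{(1 - \sigma\alpha_0)^2}{\psi(\sigma\alpha_0)} \left( 1 + \dfrac{\sigma}{1 - \sigma\alpha_0} \right)} = 0.103621842\ldots \]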
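The three constants of Definitions 1.8 to 1.10 can be reproduced numerically. The following is a minimal sketch, assuming a plain bisection for $\alpha_0$ and a truncated series for $\sigma$; the function names and iteration counts are choices made for this illustration.

```python
import math

def psi(u):
    """psi(u) = 1 - 4u + 2u^2."""
    return 1 - 4 * u + 2 * u * u

def alpha0(iterations=200):
    """Root of 2u = psi(u)^2 on [0, 1 - sqrt(2)/2], found by bisection.

    psi(u)^2 - 2u is positive at u = 0, negative at the right end point,
    and strictly decreasing in between, so the root is unique.
    """
    lo, hi = 0.0, 1 - math.sqrt(2) / 2
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if psi(mid) ** 2 - 2 * mid > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def sigma(terms=10):
    """Partial sum of sum_{k>=0} (1/2)^(2^k - 1); the tail is negligible."""
    return sum(0.5 ** (2 ** k - 1) for k in range(terms))

def s0():
    """The constant s_0 of Definition 1.10."""
    a, s = alpha0(), sigma()
    t = s * a
    return 1 / (s + (1 - t) ** 2 / psi(t) * (1 + s / (1 - t)))
```

Calling `alpha0()`, `sigma()` and `s0()` gives values that can be compared with the decimal expansions quoted in the definitions.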



Definition 1.11 We let $\beta(f, z) = \|Df(z)^{-1} f(z)\|_z$ and $\alpha(f, z) = \beta(f, z)\gamma(f, z)$. We give to $\beta(f, z)$ and $\alpha(f, z)$ the value $\infty$ when $Df(z)$ is singular.

Theorem 1.4 (R-$\alpha$-Theorem) Let $f : \mathbb{M}_n \to \mathbb{R}^n$ be analytic. Let $z \in \mathbb{M}_n$ be such that

\[ \beta(f, z) \le s_0 r_z \quad \text{and} \quad \alpha(f, z) \le \alpha_0. \]

Then the Newton sequence $z_0 = z$, $z_{k+1} = N_f(z_k)$ is defined for all integers $k \ge 0$ and converges to a zero $\zeta$ of $f$. Moreover,

\[ d(z_{k+1}, z_k) \le \left(\frac{1}{2}\right)^{2^k - 1} \beta(f, z) \]

and

\[ d(\zeta, z) \le \sigma\beta(f, z). \]

Remark 1.3 When $\mathbb{M}_n = \mathbb{R}^n$ is equipped with the usual metric structure, the radius of injectivity is $r_z = \infty$ and the first condition in Theorem 1.4 is automatically satisfied. In this context Theorems 1.2 and 1.4 coincide.

1.3 Main results for vector fields. 

The case of vector fields is treated similarly. As in Theorem 1.3 we have:

Theorem 1.5 (R-$\gamma$-Theorem) Let $X : \mathbb{M}_n \to T\mathbb{M}_n$ be an analytic vector field. Suppose that $X(\zeta) = 0$ and $DX(\zeta)$ is an isomorphism. Let

\[ R(X, \zeta) = \min\left( r_\zeta,\ \frac{K_\zeta + 2 - \sqrt{K_\zeta^2 + 4K_\zeta + 2}}{2\gamma(X, \zeta)} \right). \]

If $d(z, \zeta) \le R(X, \zeta)$ then the Newton sequence $z_k = N_X^{(k)}(z)$ is defined for all $k \ge 0$, and

\[ d(z_k, \zeta) \le \left(\frac{1}{2}\right)^{2^k - 1} d(z, \zeta). \]

As for mappings, Theorem 1.5 gives a lower estimate for the distance from other zeros and a lower estimate for the distance from the singular locus

\[ \Sigma_X = \{ z \in \mathbb{M}_n : \det DX(z) = 0 \}. \]

Corollary 1.3 Suppose that $X(\zeta) = 0$ and $DX(\zeta)$ is an isomorphism. Then, for any other zero $\zeta' \ne \zeta$ one has

\[ d(\zeta', \zeta) > R(X, \zeta). \]

Moreover, for any $z \in \Sigma_X$ the same inequality holds:

\[ d(z, \zeta) > R(X, \zeta). \]
The invariants $\beta$ and $\alpha$ are defined similarly:

Definition 1.12 We let

\[ \beta(X, z) = \|DX(z)^{-1} X(z)\|_z \]

and

\[ \alpha(X, z) = \beta(X, z)\gamma(X, z). \]

We give to $\beta(X, z)$ and $\alpha(X, z)$ the value $\infty$ when $DX(z)$ is singular.

Theorem 1.6 (R-$\alpha$-Theorem) Let $X : \mathbb{M}_n \to T\mathbb{M}_n$ be an analytic vector field. Let $z \in \mathbb{M}_n$ be such that

\[ \beta(X, z) \le s_0 r_z \quad \text{and} \quad \alpha(X, z) \le \alpha_0. \]

Then the Newton sequence $z_0 = z$, $z_{k+1} = N_X(z_k)$ is defined for all integers $k \ge 0$ and converges to a zero $\zeta$ of $X$. Moreover,

\[ d(z_{k+1}, z_k) \le \left(\frac{1}{2}\right)^{2^k - 1} \beta(X, z) \]

and

\[ d(\zeta, z) \le \sigma\beta(X, z). \]

1.4 Previous work.

There is quite a bit of previous work on such questions. The first to consider Newton's method on a manifold was Rayleigh 1899, who defined what we call today "Rayleigh Quotient Iteration", which is in fact a Newton iteration for a vector field on the sphere. Then, Shub 1986 [27] defined Newton's method for the problem of finding the zeros of a vector field on a manifold and used retractions to send a neighborhood of the origin in the tangent space onto the manifold itself. In our paper we do not use general retractions but exponential maps. Independently of [23], Smith 1994 developed an intrinsic Newton's method and a conjugate gradient algorithm on a manifold using the exponential map. Also independently, Udriste 1994 studied Newton's method to find the zeros of a gradient vector field defined on a Riemannian manifold; Owren and Welfert 1996 defined Newton's iteration for solving the equation $F(x) = 0$ where $F$ is a map from a Lie group to its corresponding Lie algebra; Edelman-Arias-Smith 1998 [9] developed Newton's and conjugate gradient algorithms on the Grassmann and Stiefel manifolds. These authors define Newton's method via the exponential map as we do here. Shub 1993, Shub and Smale 1993-1996 (see also Blum-Cucker-Shub-Smale 1998 [2]), Malajovich 1994, and Dedieu and Shub 2000 introduce and study Newton's method on projective spaces and their products. Another important paper on this subject is Adler-Dedieu-Margulies-Martens-Shub 2001, where qualitative aspects of Newton's method on Riemannian manifolds are investigated for both mappings and vector fields. This paper contains a nice application to a geometric model for the human spine represented as an 18-tuple of $3 \times 3$ orthogonal matrices. Recently Ferreira-Svaiter [10] gave a Kantorovich-like theorem for Newton's method for vector fields defined on Riemannian manifolds.



2 Parallel transport and Taylor's formula. 

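In the proof sections of this paper, we frequently use parallel transport:

Definition 2.1 (Parallel transport.) Let $z_0$ and $z \in \mathbb{M}_n$ with $z$ in the ball about $z_0$ with radius $r_{z_0}$, the radius of injectivity. Then, there exists a unique geodesic curve $c(t)$ in this ball such that $c(0) = z_0$ and $c(T) = z$ for a certain $T$. In this context we denote by

\[ P_{z_0, z} : T_{z_0}\mathbb{M}_n \to T_z\mathbb{M}_n \]

the parallel transport along this geodesic. It is an isometry which preserves the orientation when $\mathbb{M}_n$ is oriented.

We now extend this concept to other objects.

Definition 2.2 (Parallel transport: extension.)

• For a covector $\omega_{z_0} \in T^*_{z_0}\mathbb{M}_n$, by

\[ (P_{z_0, z}\,\omega_{z_0})(Y_z) = \omega_{z_0}(P_{z, z_0} Y_z) \]

for any $Y_z \in T_z\mathbb{M}_n$.

• For a tensor field $T \in \mathcal{T}^p_q(\mathbb{M}_n)$ we denote by $T_{z_0}$ its value at $z_0$, that is

\[ T_{z_0}((\omega^1)_{z_0}, \ldots, (\omega^p)_{z_0}, (Y_1)_{z_0}, \ldots, (Y_q)_{z_0}) = T(\omega^1, \ldots, \omega^p, Y_1, \ldots, Y_q)(z_0) \]

for any 1-forms $\omega^i$ and vector fields $Y_j$.

• Parallel transport for $T_{z_0}$ is defined by

\[ P_{z_0, z} T_{z_0}((\omega^1)_z, \ldots, (\omega^p)_z, (Y_1)_z, \ldots, (Y_q)_z) = T_{z_0}(P_{z, z_0}(\omega^1)_z, \ldots, P_{z, z_0}(\omega^p)_z, P_{z, z_0}(Y_1)_z, \ldots, P_{z, z_0}(Y_q)_z) \]

for any covectors $(\omega^i)_z \in T^*_z\mathbb{M}_n$ and vectors $(Y_j)_z \in T_z\mathbb{M}_n$.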
The covariant derivative of a tensor field $T \in \mathcal{T}^p_q(\mathbb{M}_n, \mathbb{R}^m)$ at a point may be described in terms of parallel transport: let $z_0 \in \mathbb{M}_n$ and $u \in T_{z_0}\mathbb{M}_n$ be given. With the geodesic curve $c(t) = \exp_{z_0}(tu)$ we have

\[ DT(z_0)u = \lim_{t \to 0} \frac{1}{t} \left( P_{c(t), z_0} T_{c(t)} - T_{z_0} \right). \]

We now give Taylor's formula. A reference is Dieudonné [7], Chap. XVIII-6, where the case of functions is considered. Tensors are treated similarly. We have

Theorem 2.1 (Taylor formula) For any tensor field $T \in \mathcal{T}^p_q(\mathbb{M}_n, \mathbb{R}^m)$, $z_0, z \in \mathbb{M}_n$ with $z$ in a certain neighborhood of $z_0$, and $u \in T_{z_0}\mathbb{M}_n$ such that $z = \exp_{z_0}(u)$ we have

\[ T(z) = \left( \sum_{k=0}^{\infty} \frac{1}{k!} D^k T(z_0) u^k \right) P_{z, z_0}. \]

Taking the $l$-th covariant derivative in Theorem 2.1 gives the following:

Corollary 2.1 With the same hypothesis, for any $l \ge 0$, we have

\[ D^l T(z) = \left( \sum_{k=0}^{\infty} \frac{1}{k!} D^{k+l} T(z_0) u^k \right) P_{z, z_0}. \]

The neighborhood of $z_0$ in Theorem 2.1 and Corollary 2.1 is given by the radius of injectivity at $z_0$ and by the disks of convergence of the Taylor series of the coordinates of the tensor field $T$ in a local chart about $z_0$. In the following we relate it to $\gamma(f, z)$. Let $f : \mathbb{M}_n \to \mathbb{R}^n$ be an analytic map (resp. $X : \mathbb{M}_n \to T\mathbb{M}_n$ an analytic vector field). As an immediate consequence of the definition of $\gamma(f, z)$ (resp. $\gamma(X, z)$) we have:

Proposition 2.1 The Taylor series at $z \in \mathbb{M}_n$ for $f$ and $D^l f$ (resp. $X$ and $D^l X$) converge in the ball about $z$ with radius $1/\gamma(f, z)$ (resp. $1/\gamma(X, z)$). Theorem 2.1 is valid for any $z$ with

\[ d(z, z_0) < \min(r_{z_0}, 1/\gamma(f, z)) \quad (\text{resp. } \min(r_{z_0}, 1/\gamma(X, z))). \]

Proof. Taking a local chart it suffices to prove this theorem in the context of a map $f : \mathbb{R}^n \to \mathbb{R}^n$. Then, we use [2], Chap. 8, Prop. 6. ■
3 Proof of the R-$\gamma$-theorem.

This proof is quite long and is split into a series of lemmas. We frequently use the notation $\|A\|_{E,F}$ for the operator norm of the linear map $A : E \to F$, and $\|A\|_E$ when $E = F$.

Lemma 3.1 Let $x, y \in \mathbb{M}_n$ with $d(x, y) < r_x$. We suppose that $Df(x)$ is nonsingular and that

\[ Df(x)^{-1} Df(y) = P_{y,x} + B P_{y,x}, \]

with $\|B\|_{T_x\mathbb{M}_n} \le r$ for a certain $r < 1$. Then, $Df(y)$ is nonsingular and

\[ \|Df(y)^{-1} Df(x)\|_{T_x\mathbb{M}_n, T_y\mathbb{M}_n} \le \frac{1}{1 - r}. \]

Proof. $Df(x)^{-1} Df(y) = (\mathrm{id}_{T_x\mathbb{M}_n} + B) P_{y,x}$. Since $\|B\|_{T_x\mathbb{M}_n} \le r < 1$ the operator $\mathrm{id}_{T_x\mathbb{M}_n} + B$ is nonsingular and its inverse satisfies $\|(\mathrm{id}_{T_x\mathbb{M}_n} + B)^{-1}\|_{T_x\mathbb{M}_n} \le 1/(1 - r)$. Then, we notice that parallel transport $P_{y,x}$ is an isometry. ■

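Lemma 3.2 Let $x, y \in \mathbb{M}_n$ with $d(x, y) < r_x$. We suppose that $Df(x)$ is nonsingular and that

\[ \nu = d(x, y)\gamma(f, x) < 1 - \frac{\sqrt{2}}{2}. \]

Then, $Df(y)$ is nonsingular and

\[ \|Df(y)^{-1} Df(x)\|_{T_x\mathbb{M}_n, T_y\mathbb{M}_n} \le \frac{(1 - \nu)^2}{\psi(\nu)}. \]

Proof. Let $u = \exp_x^{-1}(y)$. By Corollary 2.1 with $l = 1$ and $T = f$ we get

\[ Df(y) = \left( \sum_{k \ge 0} \frac{1}{k!} D^{k+1} f(x) u^k \right) P_{y,x} \]

so that

\[ Df(x)^{-1} Df(y) = P_{y,x} + \left( \sum_{k \ge 2} k\, Df(x)^{-1} \frac{D^k f(x)}{k!} u^{k-1} \right) P_{y,x} = P_{y,x} + B P_{y,x}. \]

Let us now give a bound for $\|B\|_{T_x\mathbb{M}_n}$:

\[ \|B\|_{T_x\mathbb{M}_n} \le \sum_{k \ge 2} k \left\| Df(x)^{-1} \frac{D^k f(x)}{k!} \right\|_x \|u\|_x^{k-1} \le \sum_{k \ge 2} k\, \gamma(f, x)^{k-1} d(x, y)^{k-1} = \sum_{k \ge 2} k \nu^{k-1} = \frac{1}{(1 - \nu)^2} - 1. \]

This last quantity is $< 1$ because $\nu < 1 - \sqrt{2}/2$. The conclusion is obtained from Lemma 3.1 with $r = 1/(1 - \nu)^2 - 1$, for which $1/(1 - r) = (1 - \nu)^2/\psi(\nu)$. ■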
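Lemma 3.3 Let $z, \zeta \in \mathbb{M}_n$ with $d(z, \zeta) < r_\zeta$. We suppose that $f(\zeta) = 0$, $Df(\zeta)$ is nonsingular and

\[ \nu = d(z, \zeta)\gamma(f, \zeta) < 1 - \frac{\sqrt{2}}{2}. \]

Then,

\[ \left\| Df(\zeta)^{-1} \left( Df(z) \exp_z^{-1}(\zeta) + f(z) \right) \right\|_\zeta \le \frac{\nu\, d(z, \zeta)}{(1 - \nu)^2}. \]

Remark 3.1 Let $u \in T_\zeta\mathbb{M}_n$ be such that $\exp_\zeta(u) = z$. Let $v = P_{\zeta, z} u \in T_z\mathbb{M}_n$ be the parallel transport of $u$ along the geodesic between $\zeta$ and $z$. Then $\exp_z(-v) = \zeta$, so that the vector $\exp_z^{-1}(\zeta)$ is equal to $-v$ and $P_{z, \zeta} \exp_z^{-1}(\zeta) = -u$.

Proof. Let $u \in T_\zeta\mathbb{M}_n$ be such that $\exp_\zeta(u) = z$. From Taylor's formula we get:

\[ f(z) = f(\zeta) + Df(\zeta) u + \sum_{k \ge 2} \frac{1}{k!} D^k f(\zeta) u^k \]

and

\[ Df(z) = \left( Df(\zeta) + \sum_{k \ge 2} \frac{k}{k!} D^k f(\zeta) u^{k-1} \right) P_{z, \zeta}. \]

Notice that $f(\zeta) = 0$ and $P_{z, \zeta} \exp_z^{-1}(\zeta) = -u$; thus,

\[ Df(\zeta)^{-1} \left( Df(z) \exp_z^{-1}(\zeta) + f(z) \right) = -\sum_{k \ge 2} \frac{k - 1}{k!} Df(\zeta)^{-1} D^k f(\zeta) u^k \]

and

\[ \left\| Df(\zeta)^{-1} \left( Df(z) \exp_z^{-1}(\zeta) + f(z) \right) \right\|_\zeta \le \sum_{k \ge 2} (k - 1) \gamma(f, \zeta)^{k-1} \|u\|_\zeta^k = \sum_{k \ge 2} (k - 1) \gamma(f, \zeta)^{k-1} d(z, \zeta)^k = \frac{\nu\, d(z, \zeta)}{(1 - \nu)^2}, \]

and we are done. ■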
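Lemma 3.4 Let $z, \zeta \in \mathbb{M}_n$ with $d(z, \zeta) < r_\zeta$. We suppose that $f(\zeta) = 0$, $Df(\zeta)$ is nonsingular and

\[ \nu = d(z, \zeta)\gamma(f, \zeta) < 1 - \frac{\sqrt{2}}{2}. \]

Then,

\[ \left\| \exp_z^{-1}(N_f(z)) - \exp_z^{-1}(\zeta) \right\|_z \le \frac{\nu\, d(z, \zeta)}{\psi(\nu)}. \]

Proof.

\[ \left\| \exp_z^{-1}(N_f(z)) - \exp_z^{-1}(\zeta) \right\|_z = \left\| Df(z)^{-1} \left( Df(z) \exp_z^{-1}(\zeta) + f(z) \right) \right\|_z \]

\[ = \left\| Df(z)^{-1} Df(\zeta)\, Df(\zeta)^{-1} \left( Df(z) \exp_z^{-1}(\zeta) + f(z) \right) \right\|_z \]

\[ \le \|Df(z)^{-1} Df(\zeta)\|_{T_\zeta\mathbb{M}_n, T_z\mathbb{M}_n} \left\| Df(\zeta)^{-1} \left( Df(z) \exp_z^{-1}(\zeta) + f(z) \right) \right\|_\zeta \le \frac{(1 - \nu)^2}{\psi(\nu)} \cdot \frac{\nu\, d(z, \zeta)}{(1 - \nu)^2} = \frac{\nu\, d(z, \zeta)}{\psi(\nu)} \]

by Lemma 3.2 and Lemma 3.3. This completes the proof. ■

Let us recall the definition of the geometric constant

\[ K_\zeta = \sup \frac{d(\exp_z(u), \exp_z(v))}{\|u - v\|_z} \]

where the supremum is taken over all $z \in B_{\mathbb{M}_n}(\zeta, r_\zeta)$ and $u, v \in T_z\mathbb{M}_n$ with $\|u\|_z$ and $\|v\|_z < r_\zeta$.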

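Lemma 3.5 The following inequalities hold: $K_\zeta \ge 1$ and

\[ \frac{K_\zeta + 2 - \sqrt{K_\zeta^2 + 4K_\zeta + 2}}{2} \le 1 - \frac{\sqrt{2}}{2}. \]

Moreover, if

\[ \nu \le \frac{K_\zeta + 2 - \sqrt{K_\zeta^2 + 4K_\zeta + 2}}{2} \]

then

\[ \frac{K_\zeta \nu}{\psi(\nu)} \le \frac{1}{2}. \]

Proof. The constant $K_\zeta$ is necessarily $\ge 1$ because

\[ d(\exp_\zeta(0), \exp_\zeta(v)) = \|0 - v\|_\zeta \]

for any $v$ in the ball of injectivity for $\zeta$. The second inequality comes from

\[ \frac{K_\zeta + 2 - \sqrt{K_\zeta^2 + 4K_\zeta + 2}}{2} = \frac{1}{K_\zeta + 2 + \sqrt{K_\zeta^2 + 4K_\zeta + 2}} \le \frac{1}{3 + \sqrt{7}} < 1 - \frac{\sqrt{2}}{2}. \]

The third inequality uses the fact that $K_\zeta \nu / \psi(\nu)$ is increasing on the interval $[0, 1 - \sqrt{2}/2[$. ■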

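Proof of Theorem 1.3. We are going to prove that

\[ d(z_k, \zeta) \le \left( \frac{K_\zeta \nu}{\psi(\nu)} \right)^{2^k - 1} d(z, \zeta) \]

for any $k \ge 0$, with $\nu = d(z, \zeta)\gamma(f, \zeta)$. The conclusion is then an easy consequence of the hypothesis and of Lemma 3.5. We proceed by induction: the case $k = 0$ is evident. Then,

\[ d(z_{k+1}, \zeta) = d\left( \exp_{z_k}(\exp_{z_k}^{-1}(N_f(z_k))),\ \exp_{z_k}(\exp_{z_k}^{-1}(\zeta)) \right) \le K_\zeta \left\| \exp_{z_k}^{-1}(N_f(z_k)) - \exp_{z_k}^{-1}(\zeta) \right\|_{z_k}. \]

From Lemma 3.4 we get

\[ d(z_{k+1}, \zeta) \le \frac{K_\zeta \nu_k}{\psi(\nu_k)}\, d(z_k, \zeta) \]

with $\nu_k = d(z_k, \zeta)\gamma(f, \zeta)$. By the induction hypothesis $d(z_k, \zeta) \le d(z, \zeta)$, so that $\nu_k \le \nu$ and $\psi(\nu_k) \ge \psi(\nu)$, and therefore

\[ d(z_{k+1}, \zeta) \le \frac{K_\zeta \gamma(f, \zeta)}{\psi(\nu)}\, d(z_k, \zeta)^2 \le \frac{K_\zeta \gamma(f, \zeta)}{\psi(\nu)} \left( \frac{K_\zeta \nu}{\psi(\nu)} \right)^{2^{k+1} - 2} d(z, \zeta)^2 = \left( \frac{K_\zeta \nu}{\psi(\nu)} \right)^{2^{k+1} - 1} d(z, \zeta). \]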

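4 Proof of the R-$\alpha$-theorem.

Let us first recall two definitions: $\beta(f, z) = \|Df(z)^{-1} f(z)\|_z$ and $\alpha(f, z) = \beta(f, z)\gamma(f, z)$. For the proof of Theorem 1.4 we need some more lemmas.

Lemma 4.1 For $|r| < 1$ and any integer $k \ge 0$,

\[ \sum_{l \ge 0} \frac{(k + l)!}{k!\, l!}\, r^l = \frac{1}{(1 - r)^{k+1}}. \]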
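The identity of Lemma 4.1 is the classical negative binomial series, and a quick numerical sanity check is easy to write. The sketch below compares a truncated sum with the closed form; the function names and the number of terms are choices made for this illustration.

```python
import math

def series_partial_sum(k, r, terms=400):
    """Truncation of sum_{l>=0} (k+l)! / (k! l!) * r^l."""
    return sum(math.comb(k + l, l) * r ** l for l in range(terms))

def closed_form(k, r):
    """Right-hand side 1 / (1 - r)^(k+1)."""
    return 1.0 / (1.0 - r) ** (k + 1)
```

For moderate $|r|$ the two values agree to within rounding error, which is what the proofs below rely on when summing Taylor remainders.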

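Lemma 4.2 Let $z, z_1 \in \mathbb{M}_n$ with $d(z, z_1) < r_z$. We suppose that $Df(z)$ is nonsingular and

\[ \nu = d(z, z_1)\gamma(f, z) < 1 - \frac{\sqrt{2}}{2}. \]

Then, for any integer $k \ge 2$,

\[ \frac{\|Df(z_1)^{-1} D^k f(z_1)\|_{z_1}}{k!} \le \frac{1}{\psi(\nu)} \left( \frac{\gamma(f, z)}{1 - \nu} \right)^{k-1} \]

and

\[ \|Df(z)^{-1} f(z_1)\|_z \le \beta(f, z) + \frac{d(z, z_1)}{1 - \nu}. \]

Proof. Let $u \in T_z\mathbb{M}_n$ be such that $\exp_z(u) = z_1$. From Taylor's formula (Theorem 2.1 and Corollary 2.1) we get:

\[ D^k f(z_1) = \left( \sum_{l \ge 0} \frac{1}{l!} D^{k+l} f(z) u^l \right) P_{z_1, z} \]

so that

\[ \frac{\|Df(z_1)^{-1} D^k f(z_1)\|_{z_1}}{k!} \le \|Df(z_1)^{-1} Df(z)\|_{T_z\mathbb{M}_n, T_{z_1}\mathbb{M}_n} \left( \sum_{l \ge 0} \frac{(k + l)!}{k!\, l!} \left\| Df(z)^{-1} \frac{D^{k+l} f(z)}{(k + l)!} \right\|_z \|u\|_z^l \right) \le \frac{(1 - \nu)^2}{\psi(\nu)} \cdot \frac{\gamma(f, z)^{k-1}}{(1 - \nu)^{k+1}} \]

using Lemma 3.2 and Lemma 4.1, the definition of $\gamma$, the fact that $P_{z_1, z}$ is an isometry and $\|u\|_z = d(z, z_1)$. This proves the first inequality. Let us now prove the second one.

\[ \|Df(z)^{-1} f(z_1)\|_z = \left\| \sum_{k \ge 0} \frac{1}{k!} Df(z)^{-1} D^k f(z) u^k \right\|_z \le \|Df(z)^{-1} f(z)\|_z + \|u\|_z \sum_{k \ge 1} \gamma(f, z)^{k-1} \|u\|_z^{k-1} = \beta(f, z) + \frac{d(z, z_1)}{1 - \nu} \]

and we are done. ■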

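Lemma 4.3 Let $z, z_1 \in \mathbb{M}_n$ with $d(z, z_1) < r_z$. We suppose that $Df(z)$ is nonsingular and

\[ \nu = d(z, z_1)\gamma(f, z) < 1 - \frac{\sqrt{2}}{2}; \]

then

\[ \beta(f, z_1) \le \frac{(1 - \nu)^2}{\psi(\nu)} \left( \beta(f, z) + \frac{d(z, z_1)}{1 - \nu} \right), \]

\[ \gamma(f, z_1) \le \frac{\gamma(f, z)}{(1 - \nu)\psi(\nu)}. \]

Proof. The first estimate is a consequence of Lemma 3.2, Lemma 4.2 and the following:

\[ \beta(f, z_1) = \|Df(z_1)^{-1} f(z_1)\|_{z_1} \le \|Df(z_1)^{-1} Df(z)\|_{T_z\mathbb{M}_n, T_{z_1}\mathbb{M}_n}\, \|Df(z)^{-1} f(z_1)\|_z. \]

The second inequality is an easy consequence of Lemma 4.2:

\[ \gamma(f, z_1) = \sup_{k \ge 2} \left\| Df(z_1)^{-1} \frac{D^k f(z_1)}{k!} \right\|_{z_1}^{1/(k-1)} \le \sup_{k \ge 2} \left( \frac{1}{\psi(\nu)} \right)^{1/(k-1)} \frac{\gamma(f, z)}{1 - \nu} = \frac{\gamma(f, z)}{(1 - \nu)\psi(\nu)} \]

because $\nu < 1 - \sqrt{2}/2$ implies $\psi(\nu) \le 1$ and the supremum is achieved for $k = 2$. ■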

Lemma 4.4 Let $\mathbb{M}_n$ be a complete Riemannian manifold. Then, for any $x, y \in \mathbb{M}_n$ we have

\[ r_x - d(x, y) \le r_y. \]

Proof. To prove this inequality we show that $\exp_y$ is injective in the ball about $0$ with radius $r_x - d(x, y)$ in $T_y\mathbb{M}_n$. Let $u \in T_x\mathbb{M}_n$ be such that $y = \exp_x u$ and $\|u\|_x = d(x, y)$. Let $v, w \in T_y\mathbb{M}_n$ be such that $\exp_y(v) = \exp_y(w)$ and $\|v\|_y = \|w\|_y < r_x - d(x, y)$. Let $P$ denote the parallel transport from $T_y\mathbb{M}_n$ to $T_x\mathbb{M}_n$. We have

\[ \exp_y(v) = \exp_x(u + Pv) \quad \text{and} \quad \exp_y(w) = \exp_x(u + Pw). \]

Moreover

\[ \|u + Pv\|_x \le \|u\|_x + \|Pv\|_x = d(x, y) + \|v\|_y < d(x, y) + r_x - d(x, y) = r_x \]

and a similar inequality holds with $w$. Since $\exp_x$ is injective in this ball we get $u + Pv = u + Pw$, so that $v = w$ and we are done. ■

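Lemma 4.5 Let $x \in \mathbb{M}_n$ be such that

\[ \beta(f, x) \le s_0 r_x \quad \text{and} \quad \alpha(f, x) \le \alpha_0. \]

Then, for any $y \in \mathbb{M}_n$ such that $d(x, y) \le \sigma\beta(f, x)$ we have

\[ \beta(f, y) \le r_y. \]

Proof. Let $s$ be a positive real number and let us suppose that $\beta(f, x) \le s r_x$. Let $y$ be such that $d(x, y) \le \sigma\beta(f, x)$, and let $\nu = d(x, y)\gamma(f, x)$. We have, by Lemma 4.3,

\[ \beta(f, y) \le \frac{(1 - \nu)^2}{\psi(\nu)} \left( 1 + \frac{\sigma}{1 - \nu} \right) \beta(f, x) \le \frac{(1 - \nu)^2}{\psi(\nu)} \left( 1 + \frac{\sigma}{1 - \nu} \right) s r_x. \]

Moreover, by Lemma 4.4,

\[ r_x \le r_y + d(x, y) \le r_y + \sigma\beta(f, x) \le r_y + \sigma s r_x \]

so that

\[ r_x \le \frac{r_y}{1 - \sigma s} \]

as soon as $\sigma s < 1$. Thus

\[ \beta(f, y) \le \frac{(1 - \nu)^2}{\psi(\nu)} \left( 1 + \frac{\sigma}{1 - \nu} \right) \frac{s}{1 - \sigma s}\, r_y \]

so that $\beta(f, y) \le r_y$ if

\[ \frac{(1 - \nu)^2}{\psi(\nu)} \left( 1 + \frac{\sigma}{1 - \nu} \right) \frac{s}{1 - \sigma s} \le 1 \quad \text{and} \quad \sigma s < 1. \]

These conditions are satisfied when

\[ s \le \frac{1}{\sigma + \dfrac{(1 - \nu)^2}{\psi(\nu)} \left( 1 + \dfrac{\sigma}{1 - \nu} \right)}. \]

We also notice that

\[ \nu = d(x, y)\gamma(f, x) \le \sigma\beta(f, x)\gamma(f, x) = \sigma\alpha(f, x) \le \sigma\alpha_0. \]

Since the function

\[ \nu \mapsto \sigma + \frac{(1 - \nu)^2}{\psi(\nu)} \left( 1 + \frac{\sigma}{1 - \nu} \right) \]

is increasing we obtain the following sufficient condition:

\[ s \le \frac{1}{\sigma + \dfrac{(1 - \sigma\alpha_0)^2}{\psi(\sigma\alpha_0)} \left( 1 + \dfrac{\sigma}{1 - \sigma\alpha_0} \right)} = 0.103621842\ldots \]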



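Lemma 4.6 Let $z \in \mathbb{M}_n$ and $z_1 = N_f(z)$. We suppose that

\[ \nu = d(z, z_1)\gamma(f, z) < 1 - \frac{\sqrt{2}}{2}; \]

then

• $\beta(f, z_1) \le \dfrac{1 - \nu}{\psi(\nu)}\, \alpha(f, z)\beta(f, z)$,

• $\alpha(f, z_1) \le \dfrac{\alpha(f, z)^2}{\psi(\nu)^2}$.

Proof. From Lemma 3.2 we get the following:

\[ \beta(f, z_1) = \|Df(z_1)^{-1} f(z_1)\|_{z_1} \le \|Df(z_1)^{-1} Df(z)\|_{T_z\mathbb{M}_n, T_{z_1}\mathbb{M}_n}\, \|Df(z)^{-1} f(z_1)\|_z \le \frac{(1 - \nu)^2}{\psi(\nu)}\, \|Df(z)^{-1} f(z_1)\|_z. \]

Let $u \in T_z\mathbb{M}_n$ be such that $\exp_z(u) = z_1$. From Taylor's formula

\[ f(z_1) = f(z) + Df(z) u + \sum_{k \ge 2} \frac{1}{k!} D^k f(z) u^k. \]

Since $z_1 = N_f(z)$ we have $f(z) + Df(z) u = 0$, so that

\[ \|Df(z)^{-1} f(z_1)\|_z \le \sum_{k \ge 2} \frac{1}{k!} \|Df(z)^{-1} D^k f(z)\|_z \|u\|_z^k \le \sum_{k \ge 2} \gamma(f, z)^{k-1} d(z, z_1)^k = \frac{\gamma(f, z)\, d(z, z_1)^2}{1 - \nu} = \frac{\gamma(f, z)\, \beta(f, z)^2}{1 - \nu}. \]

This proves the first inequality. For the second we multiply together $\beta(f, z_1) \le \frac{1 - \nu}{\psi(\nu)}\, \alpha(f, z)\beta(f, z)$ and $\gamma(f, z_1) \le \frac{\gamma(f, z)}{(1 - \nu)\psi(\nu)}$, obtained in Lemma 4.3. ■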

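Proof of Theorem 1.4. Let us first introduce some more notation: $z_k$ is the Newton sequence starting at $z_0 = z$, $\beta_k = \beta(f, z_k) = d(z_k, z_{k+1})$, $\gamma_k = \gamma(f, z_k)$, $\alpha_k = \alpha(f, z_k) = \gamma_k d(z_k, z_{k+1})$ and $r_k$ is the radius of injectivity at $z_k$. In this proof $\alpha_0$ stands for $\alpha(f, z_0)$, which by hypothesis is at most the constant of Definition 1.8. We shall prove, by induction, the following:

• $1_k$ : $\alpha_k \le \left(\frac{1}{2}\right)^{2^k - 1} \alpha_0$,

• $2_k$ : $\beta_k \le \left(\frac{1}{2}\right)^{2^k - 1} \beta_0$,

• $3_k$ : $\beta_k \le r_k$.

This will prove Theorem 1.4. These inequalities are clearly satisfied when $k = 0$. To prove $1_{k+1}$ we use $1_k$ and Lemma 4.6 (with $\nu = \alpha_k$):

\[ \alpha_{k+1} \le \frac{\alpha_k^2}{\psi(\alpha_k)^2} \le \frac{\alpha_0^2}{\psi(\alpha_0)^2} \left(\frac{1}{2}\right)^{2^{k+1} - 2} \le \frac{1}{2} \left(\frac{1}{2}\right)^{2^{k+1} - 2} \alpha_0 = \left(\frac{1}{2}\right)^{2^{k+1} - 1} \alpha_0, \]

since $\alpha_0/\psi(\alpha_0)^2 \le 1/2$. To prove $2_{k+1}$ we use a similar argument: by Lemma 4.6,

\[ \beta_{k+1} \le \frac{1 - \alpha_k}{\psi(\alpha_k)}\, \alpha_k \beta_k \le \frac{1}{\psi(\alpha_k)}\, \alpha_k \beta_k \le \frac{1}{\psi(\alpha_0)} \left(\frac{1}{2}\right)^{2^k - 1} \alpha_0 \left(\frac{1}{2}\right)^{2^k - 1} \beta_0. \]

Since $\alpha_0/\psi(\alpha_0) \le 1/2$ we obtain

\[ \beta_{k+1} \le \left(\frac{1}{2}\right)^{2^{k+1} - 1} \beta_0 \]

and we are done. To prove $3_{k+1}$ we use the hypothesis, Lemma 4.5 and the following estimate:

\[ d(z_0, z_{k+1}) \le \sum_{i=0}^{k} d(z_i, z_{i+1}) = \sum_{i=0}^{k} \beta_i \le \sum_{i=0}^{k} \left(\frac{1}{2}\right)^{2^i - 1} \beta_0 \le \sigma\beta_0. \]

■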

Proof of Theorems 1.5 and 1.6. The proofs of Theorems 1.5 and 1.6 are formally identical to the proofs of Theorems 1.3 and 1.4 respectively. The only difference is that $X$ is an analytic vector field, and vector fields are 1-contravariant 0-covariant tensor fields. Therefore, its $k$-th derivative is a 1-contravariant $k$-covariant tensor field, instead of a $k$-covariant vectorial tensor field.



5 Examples 

First example: the unit sphere. $\mathbb{S}^n$ denotes the unit sphere in $\mathbb{R}^{n+1}$, the tangent space $T_x\mathbb{S}^n$ is the hyperplane in $\mathbb{R}^{n+1}$ orthogonal to $x$, the Riemannian structure is given by the Euclidean structure of $\mathbb{R}^{n+1}$ and the Riemannian distance in $\mathbb{S}^n$ is the arc length taken along great circles:

\[ d(x, y) = \arccos \langle x, y \rangle. \]

The exponential map at $x \in \mathbb{S}^n$ is given by

\[ \exp_x(u) = x \cos\|u\| + u\, \frac{\sin\|u\|}{\|u\|} \]

for any $u \in T_x\mathbb{S}^n$. The radius of injectivity is equal to $r_x = \pi$ and the constant appearing in Definition 1.7 is $K_x = 1$ because $\mathbb{S}^n$ has positive sectional curvature. Newton's method is given by

\[ u = -Df(x)^{-1} f(x), \qquad N_f(x) = x \cos\|u\| + u\, \frac{\sin\|u\|}{\|u\|}. \]
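The Newton operator on the sphere can be sketched in a few lines. The example below works on $\mathbb{S}^2 \subset \mathbb{R}^3$ with a hypothetical map $f(x) = (x_1 - a, x_2 - b)$, chosen for this illustration only because the tangential linear system $Df(x)u = -f(x)$, $\langle x, u \rangle = 0$ can then be solved in closed form. The distance is computed as $2\arcsin(\|x - y\|/2)$, which equals $\arccos\langle x, y\rangle$ for unit vectors but is more accurate for nearby points.

```python
import math

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def sphere_exp(x, u):
    """exp_x(u) = x cos|u| + u sin|u| / |u| for a tangent vector u at x."""
    n = norm(u)
    if n == 0.0:
        return list(x)
    c, s = math.cos(n), math.sin(n) / n
    return [c * xi + s * ui for xi, ui in zip(x, u)]

def sphere_dist(x, y):
    """Great-circle distance, written via the chord for numerical accuracy."""
    chord = norm([a - b for a, b in zip(x, y)])
    return 2 * math.asin(min(1.0, chord / 2))

A, B = 0.3, 0.4  # hypothetical target values defining f

def f(x):
    return (x[0] - A, x[1] - B)

def newton_step(x):
    """N_f(x) = exp_x(u) with Df(x) u = -f(x) and u tangent to the sphere at x."""
    f0, f1 = f(x)
    u0, u1 = -f0, -f1                      # Df(x) u = (u0, u1) for tangent u
    u2 = -(x[0] * u0 + x[1] * u1) / x[2]   # enforce <x, u> = 0; needs x[2] != 0
    return sphere_exp(x, [u0, u1, u2])

zeta = [A, B, math.sqrt(1 - A * A - B * B)]  # the zero in the upper hemisphere
x = [0.35, 0.45, 0.85]
x = [c / norm(x) for c in x]                 # starting point on the sphere
history = [sphere_dist(x, zeta)]
for _ in range(6):
    x = newton_step(x)
    history.append(sphere_dist(x, zeta))
```

The list `history` records the distances $d(z_k, \zeta)$, so the quadratic decay predicted by Theorem 1.3 can be inspected directly for this particular starting point.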
The size of the ball in Theorem 1.3 is equal to

\[ R(f, \zeta) = \min\left( \pi,\ \frac{3 - \sqrt{7}}{2\gamma(f, \zeta)} \right). \]

Second example: the orthogonal group. $\mathbb{O}_n$ denotes the orthogonal group. The tangent space at the identity matrix $\mathrm{id}_n$ is equal to $\mathcal{A}_n$, the space of $n$ by $n$ antisymmetric matrices. More generally, the tangent space at $u \in \mathbb{O}_n$ is equal to

\[ T_u\mathbb{O}_n = u\mathcal{A}_n = \{ ua : a \in \mathcal{A}_n \}. \]

This Riemannian structure is given by the usual scalar product of $n$ by $n$ matrices

\[ \langle a, b \rangle = \mathrm{Trace}(b^T a) \]

for any $u \in \mathbb{O}_n$ and $a, b \in T_u\mathbb{O}_n$. The norm associated with this scalar product is the Frobenius norm and it is denoted by $\|a\|_F$, while the usual spectral norm is denoted by $\|a\|$. $\mathbb{O}_n$ is a Lie group and this metric structure is bi-invariant. Thus, the constant appearing in Definition 1.7 is $K_u = 1$.

The exponential map at $u \in \mathbb{O}_n$ is given by the exponential of matrices:

\[ \exp_u(a) = u \exp(u^{-1} a) \]

for any $a \in T_u\mathbb{O}_n$, with

\[ \exp(a) = \sum_{k=0}^{\infty} \frac{a^k}{k!}. \]

The inverse of the exponential is the logarithm

\[ \log(\mathrm{id}_n - b) = -\sum_{k=1}^{\infty} \frac{b^k}{k} \]

defined for any matrix $b$ with $\|b\| < 1$. Thus, the inverse of the exponential map

\[ \exp_u^{-1}(b) = u \log(u^{-1} b) \]

is defined for any $b \in T_u\mathbb{O}_n$ such that $\|\mathrm{id}_n - u^{-1} b\| < 1$, which is satisfied if and only if $\|u - b\| < 1$. Consequently, the radius of injectivity is $r_u = 1$. Newton's method is given by

\[ N_f(u) = u \exp\left( -u^{-1} Df(u)^{-1} f(u) \right). \]

The size of the ball in Theorem 1.3 is equal to

\[ R(f, \zeta) = \min\left( 1,\ \frac{3 - \sqrt{7}}{2\gamma(f, \zeta)} \right). \]

Third example: real projective space $\mathbb{P}_n(\mathbb{R})$. Real projective space may be constructed as the quotient of $\mathbb{S}^n \subset \mathbb{R}^{n+1}$ by the equivalence relation $x \equiv -x$. Therefore, it has positive sectional curvature and hence $K_x = 1$. The radius of injectivity of the exponential is $\pi/2$.

Newton's method on $\mathbb{P}_n(\mathbb{R})$ may be constructed as in the unit sphere (First example).

The size of the ball in Theorem 1.3 is equal to

\[ R(f, \zeta) = \min\left( \frac{\pi}{2},\ \frac{3 - \sqrt{7}}{2\gamma(f, \zeta)} \right). \]



Fourth example: Hermitian manifolds 

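Let $M$ be an analytic, Hermitian $n$-dimensional manifold with metric
$\langle\cdot,\cdot\rangle_H$. In particular, $M$ is also a $2n$-dimensional analytic
Riemannian manifold with metric $\langle\cdot,\cdot\rangle = \mathrm{Re}(\langle\cdot,\cdot\rangle_H)$.

If $f : M \to \mathbb{C}^n$ is analytic, we define a real analytic function
$f_{\mathbb{R}} : M \to \mathbb{R}^{2n}$ by $f_{\mathbb{R}}(z) = (\mathrm{Re}(f(z)), \mathrm{Im}(f(z)))$.

Let $Df(z) : T_z M \to \mathbb{C}^n$ denote the complex derivative of $f$, in coordinates
$z_1, \ldots, z_n$. Then,

\[ Df_{\mathbb{R}}(z) =
   \begin{pmatrix} 1 & i \\ 1 & -i \end{pmatrix}^{-1}
   \begin{pmatrix} Df(z) & 0 \\ 0 & \overline{Df(z)} \end{pmatrix}
   \begin{pmatrix} 1 & i \\ 1 & -i \end{pmatrix}. \]

It follows that $Df_{\mathbb{R}}(z) \begin{pmatrix} u \\ v \end{pmatrix} = f_{\mathbb{R}}(z)$
if and only if $Df(z) \cdot (u + iv) = f(z)$.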
Therefore, Newton's method in an Hermitian manifold is also given by

\[ N_f(z) = \exp_z\left(-Df(z)^{-1} f(z)\right). \]

By the same argument, the invariants $\beta(f,z) = \|Df(z)^{-1} f(z)\|$ and

\[ \gamma(f,z) = \sup_{k \ge 2} \left\| Df(z)^{-1} \frac{D^k f(z)}{k!} \right\|^{1/(k-1)} \]

are equal, respectively, to the Riemannian invariants $\beta(f_{\mathbb{R}}, z)$ and
$\gamma(f_{\mathbb{R}}, z)$.

Therefore, Theorems 1.3 to 1.6 apply verbatim to Hermitian manifolds and
maps $M \to \mathbb{C}^n$, or to vector fields on Hermitian manifolds.
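The equivalence between the complex Newton equation $Df(z)(u + iv) = f(z)$ and its real form can be checked numerically in coordinates. The following Python sketch is only an illustration: the matrix `J` stands in for $Df(z)$ and the vector `fz` for $f(z)$, both chosen arbitrarily, and the helpers `solve` and `realify` are hypothetical names. Writing $Df(z) = P + iQ$ with $P$, $Q$ real, the real derivative acts on $(u, v)$ as the block matrix with rows $(P, -Q)$ and $(Q, P)$.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting; works for real or complex entries."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            m = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= m * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def realify(J):
    """Real 2n x 2n block matrix [[P, -Q], [Q, P]] of the complex matrix J = P + iQ."""
    n = len(J)
    top = [[J[i][j].real for j in range(n)] + [-J[i][j].imag for j in range(n)]
           for i in range(n)]
    bottom = [[J[i][j].imag for j in range(n)] + [J[i][j].real for j in range(n)]
              for i in range(n)]
    return top + bottom

# Arbitrary illustrative data (assumptions, not taken from the paper).
J = [[2 + 1j, 0.5 - 1j], [1j, 3 - 2j]]     # stands in for Df(z)
fz = [1 - 1j, 0.25 + 2j]                   # stands in for f(z)
n = len(fz)

w = solve(J, fz)                                    # complex: Df(z) w = f(z)
f_real = [c.real for c in fz] + [c.imag for c in fz]
uv = solve(realify(J), f_real)                      # real: Df_R(z) (u, v) = f_R(z)
w_from_real = [complex(uv[k], uv[n + k]) for k in range(n)]
```

Both solves give the same tangent vector up to rounding, so the Newton direction can be computed in complex arithmetic at half the dimension.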



6 Alternative formulation of the R-$\gamma$-Theorem

In this section we investigate a question posed by an anonymous referee about the
R-$\gamma$-Theorem (Theorem 1.3). Using another proof we state it independently of
the invariant $K_\zeta$ introduced in Definition 1.7. We only state the R-$\gamma$-Theorem
for mappings. The theorem for vector fields is analogous.

Theorem 6.1 (R-$\gamma$-theorem) There are constants $\nu_0 = 0.069778332\ldots$ and
$t_0 = 0.075262346\ldots$ such that the following statement is true. Let $M_n$ be geodesically
complete and let $f : M_n \to \mathbb{R}^n$ be analytic. Suppose that $f(\zeta) = 0$ and
$Df(\zeta)$ is an isomorphism.
Let

\[ R(f,\zeta) = \min\left(t_0 r_\zeta, \frac{\nu_0}{\gamma(f,\zeta)}\right). \]

If $d(z,\zeta) \le R(f,\zeta)$, then the Newton sequence $z_k = N_f^k(z)$ is defined for all
$k \ge 0$, and

d{z,,o<^l^l^ mz). 

Which theorem is the best? Theorem 1.3 or Theorem 6.1?
When $M_n$ has non-negative sectional curvature then, according to Corollary
1.1, Theorem 1.3 gives a better result than Theorem 6.1. More generally, when

\[ K_\zeta < \frac{1}{2\nu_0} + \nu_0 - 2 = 5.235326440\ldots, \]

the expression $\left(K_\zeta + 2 - \sqrt{K_\zeta^2 + 4K_\zeta + 2}\right)/2$
in the hypothesis of Theorem 1.3 is larger than the constant $\nu_0$. This means
that, unless geodesics spread away by a factor larger than 5 in the relevant
neighborhood, Theorem 1.3 is sharper than Theorem 6.1. Otherwise Theorem 6.1 may
be more useful.

We notice that, even if the formulation of Theorem 6.1 does not depend on $K_\zeta$,
both radii of (proved) quadratic convergence depend on the metric at
$\zeta$ via $\gamma(f,\zeta)$ and consequently on the curvature at this point. This also shows
that, as in the case of linear spaces, the main invariant which estimates the size
of the quadratic attraction basin of a root is the invariant $\gamma$.
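The comparison between the two radii can be made concrete with a short Python sketch. It assumes the radius of Theorem 1.3 has the form $\min(r_\zeta, g(K_\zeta)/\gamma(f,\zeta))$ with $g(K) = (K + 2 - \sqrt{K^2 + 4K + 2})/2$, as in the examples of the previous section, and uses the nine-digit values of $\nu_0$ and $t_0$ quoted in Theorem 6.1. The function names are hypothetical.

```python
import math

NU0 = 0.069778332   # nu_0 as quoted in Theorem 6.1 (truncated)
T0 = 0.075262346    # t_0 as quoted in Theorem 6.1 (truncated)

def gamma_factor(K):
    """g(K) = (K + 2 - sqrt(K^2 + 4K + 2)) / 2 from the hypothesis of Theorem 1.3."""
    return (K + 2.0 - math.sqrt(K * K + 4.0 * K + 2.0)) / 2.0

def radius_thm_1_3(r_inj, K, gamma):
    """min(r, g(K) / gamma): radius of the ball in Theorem 1.3."""
    return min(r_inj, gamma_factor(K) / gamma)

def radius_thm_6_1(r_inj, gamma):
    """min(t_0 r, nu_0 / gamma): radius of the ball in Theorem 6.1."""
    return min(T0 * r_inj, NU0 / gamma)

# Value of K at which g(K) equals nu_0; below it Theorem 1.3 gives the larger ball.
K_threshold = 1.0 / (2.0 * NU0) + NU0 - 2.0

# Unit sphere (K = 1, r = pi) with an illustrative value gamma = 1.
r13 = radius_thm_1_3(math.pi, 1.0, 1.0)
r61 = radius_thm_6_1(math.pi, 1.0)
```

Since $g$ is decreasing in $K$, solving $g(K) = \nu_0$ for $K$ gives the threshold $1/(2\nu_0) + \nu_0 - 2$ used above.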

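Proof of Theorem 6.1. Let $\nu_0$ be the smallest positive root of the equation

\[ \frac{u}{\psi(u)^2} = \alpha_0. \]

Numerically, $\nu_0 = 0.069778332\ldots$ Also, let

\[ t_0 = \frac{s_0}{s_0 + \frac{1-\nu_0}{\psi(\nu_0)}} = 0.075262346\ldots \]

We assume that $\zeta$ is such that $f(\zeta) = 0$ and $Df(\zeta)$ is an isomorphism. Let $z_0$ be
such that $d(\zeta, z_0) \le \nu_0/\gamma(f,\zeta)$. Since $\nu_0 < 1 - \sqrt{2}/2$,

\[ u = d(\zeta, z_0)\,\gamma(f,\zeta) < 1 - \frac{\sqrt{2}}{2} \]

and by Lemma 4.3 we have

\[ \beta(f, z_0) \le d(z_0, \zeta)\,\frac{1-u}{\psi(u)} \]

and

\[ \gamma(f, z_0) \le \frac{\gamma(f,\zeta)}{(1-u)\,\psi(u)}. \]

Therefore,

\[ \alpha(f, z_0) \le \frac{u}{\psi(u)^2} \le \alpha_0. \]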

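In order to apply the R-$\alpha$-Theorem, we need to show that $\beta(f, z_0) \le s_0 r_{z_0}$.
Let $0 < t < 1$ be a real number such that $d(z_0, \zeta) \le t r_\zeta$. As previously,

\[ \beta(f, z_0) \le d(z_0, \zeta)\,\frac{1-u}{\psi(u)} \le t\, r_\zeta\,\frac{1-u}{\psi(u)}. \]

By Lemma 4.4,

\[ r_\zeta \le r_{z_0} + d(z_0, \zeta) \le r_{z_0} + t r_\zeta \]

so that

\[ r_\zeta \le \frac{r_{z_0}}{1-t} \]

and

\[ \beta(f, z_0) \le \frac{1-u}{\psi(u)}\,\frac{t}{1-t}\, r_{z_0}. \]

This gives $\beta(f, z_0) \le s_0 r_{z_0}$ as soon as

\[ \frac{1-u}{\psi(u)}\,\frac{t}{1-t} \le s_0 \]

or, equivalently,

\[ t \le \frac{s_0}{s_0 + \frac{1-u}{\psi(u)}}, \]

which holds whenever

\[ t \le \frac{s_0}{s_0 + \frac{1-\nu_0}{\psi(\nu_0)}} = t_0 = 0.075262346\ldots \]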
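The relations between the constants in this proof can be checked numerically from the two values quoted in Theorem 6.1. The sketch below rests on an assumption that this section does not state explicitly: $\psi(u) = 1 - 4u + 2u^2$, the usual polynomial of $\alpha$-theory, whose smallest positive root is $1 - \sqrt{2}/2$ as used above. Under that assumption it recovers the values of $\alpha_0$ and $s_0$ implied by $\nu_0$ and $t_0$, which come out close to 0.1307 and 0.1036.

```python
import math

NU0 = 0.069778332   # nu_0 as quoted in Theorem 6.1 (truncated)
T0 = 0.075262346    # t_0 as quoted in Theorem 6.1 (truncated)

def psi(u):
    # Assumption: psi(u) = 1 - 4u + 2u^2 (not restated in this section).
    return 1.0 - 4.0 * u + 2.0 * u * u

# alpha_0 implied by nu_0 being a root of u / psi(u)^2 = alpha_0.
alpha0 = NU0 / psi(NU0) ** 2

# s_0 implied by t_0 = s_0 / (s_0 + (1 - nu_0) / psi(nu_0)).
ratio = (1.0 - NU0) / psi(NU0)
s0 = T0 * ratio / (1.0 - T0)

# Round trip: plugging s_0 back into the formula should return t_0.
t0_check = s0 / (s0 + ratio)

# The bound on beta(f, z_0) / r_{z_0} at u = nu_0 and t = t_0 equals s_0.
beta_bound = ratio * T0 / (1.0 - T0)
```

These are consistency checks on the algebra only. They do not replace the hypotheses of the R-$\alpha$-Theorem, which are stated elsewhere in the paper.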

We can now apply the R-$\alpha$-theorem (Theorem 1.4):
Hence,

7 Conclusions and suggestions for further research



In this paper, we gave a generalization of $\alpha$-theory to Riemannian (and therefore
Hermitian) manifolds. This generalization is subtle, due to the influence of new
intrinsic factors, such as the radius of injectivity of the exponential and the
curvature.

We developed an intrinsic approach avoiding the use of local charts or isometric
imbeddings. Except in the case of submanifolds, such imbeddings are
often artificial and they lead to high dimensional problems, roughly speaking
for a dimension n manifold according to Nash's Embedding Theorem.

Our next objective is to implement this method. It is clear from the examples
we have in mind and from the work already done that we have to take into
account the data structure describing the considered problem. See for example
Celledoni-Iserles for Lie group methods, Edelman-Arias-Smith for examples
of manifolds described by the action of a group on a set, and
Adler-Dedieu-Margulies-Martens-Shub for a product of special orthogonal groups. These three
papers show three different ways to compute the exponential map associated with
the considered manifold and therefore three different ways to implement Newton's
method.